---
title: 'Individual differences in nonverbal prediction and vocabulary size in infancy'
author: "Tracy Reuter, Lauren Emberson, Alexa Romberg, and Casey Lew-Williams"
output: pdf_document
toc: yes
toc_depth: 1
fontsize: 12pt
fig_height: 4
fig_width: 7
---

```{r, echo=FALSE, warning=FALSE, message=FALSE}
# set working directory
setwd("~/Desktop/supplementary.materials")

# load libraries
library(doBy)
library(ggplot2)
library(tidyr)
library(plyr)
library(RColorBrewer)
library(knitr)
library(extrafont)

# read in data
FIPdata <- read.csv("nonverbal.prediction.task.data.csv", header=T)
```
\newpage

# Nonverbal Prediction Task

We hypothesized that infants with larger vocabularies would make more predictions and would update predictions more readily than infants with smaller vocabularies. To evaluate these hypotheses, we correlated infants’ productive vocabulary percentile (MCDI) and anticipatory eye movements (AEMs) during the prediction task. We excluded Trial 1 from all analyses, as there was no basis for infants to generate predictions therein. We also compared high-vocabulary and low-vocabulary groups, based on a median split in MCDI scaled scores. Finally, we analyzed visual processing speed to evaluate alternative explanations for individual variability in infants’ performance on the prediction task.
\newpage

## AEMs in Both Blocks

To evaluate whether infants with larger vocabularies generated more predictions overall (regardless of accuracy), we first correlated infants’ vocabulary size with their proportion of total trials with an AEM to either location. There was striking variation in both vocabulary size (MCDI percentile range=5-99, M=42, SD=28) and proportion of trials with an AEM (range=0-100, M=55, SD=29). Including all infants, regardless of how many AEMs they generated, we found no correlation between infants’ vocabulary size and overall proportion of trials with an AEM [r(48)=-0.12, p=0.39]. Examining each Block separately, there was no correlation in Block 1 [r(48)=-0.03, p=0.82] nor in Block 2 [r(47)=-0.17, p=0.23]. Thus, contrary to our first hypothesis, infants with relatively larger vocabularies were not more likely to anticipate the target’s appearance.
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize data by proportion of AEMs
corr.data <- summaryBy(Words.Says.Scaled + AEM.Any ~ Sub.Num, FUN=c(mean), data=FIPdata)

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","Any.AEM")
ggplot(data=corr.data, aes(x=MCDI, y=Any.AEM)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-.1,1.1), breaks=seq(0, 1, .1), name="Proportion of Trials\nwith an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$Any.AEM) #n.s.

# summarize data by count of AEMs
MCDI.data <- summaryBy(Words.Says.Scaled ~ Sub.Num, FUN=c(mean), data=FIPdata)
AEM.data <- summaryBy(AEM.Any ~ Sub.Num + Tr.Num, FUN=c(sum), data=FIPdata)
AEM.data$AEM <- ifelse(AEM.data$AEM.Any.sum >0,1,0)
AEM.data <- summaryBy(AEM ~ Sub.Num, FUN=c(sum), data=AEM.data)
corr.data <- merge(MCDI.data, AEM.data)

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","AEM.count")
ggplot(data=corr.data, aes(x=MCDI, y=AEM.count)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-1,16), breaks=seq(0,15,1), name="Number of Trials\nwith an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$AEM.count) #n.s.
```
\newpage

## AEMs in Block 1
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize data with proportion of AEMs in Block 1
corr.data <- summaryBy(Words.Says.Scaled + AEM.Any ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 1"))

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","Any.AEM")
ggplot(data=corr.data, aes(x=MCDI, y=Any.AEM)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-.1,1.1), breaks=seq(0, 1, .1), name="Proportion of Trials\nwith an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$Any.AEM) #n.s.

# summarize data by count of AEMs in Block 1
MCDI.data <- summaryBy(Words.Says.Scaled ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 1"))
AEM.data <- summaryBy(AEM.Any ~ Sub.Num + Tr.Num, FUN=c(sum), data=subset(FIPdata, FIPdata$Block=="Block 1"))
AEM.data$AEM <- ifelse(AEM.data$AEM.Any.sum >0,1,0)
AEM.data <- summaryBy(AEM ~ Sub.Num, FUN=c(sum), data=AEM.data)
corr.data <- merge(MCDI.data, AEM.data)

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","AEM.count")
ggplot(data=corr.data, aes(x=MCDI, y=AEM.count)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-1,8), breaks=seq(0,7,1), name="Number of Trials\nwith an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$AEM.count) #n.s.
```
\newpage

## AEMs by Trial in Block 1

We also examined AEMs trial-by-trial in Block 1. We find that high-MCDI infants do not generate predictions faster in the learning process than low-MCDI infants. Again, this suggests both groups learned equally well in the first block. The rate of AEMs from trial to trial also suggested that this task may have been too easy for infants. On each trial, about 50% of infants were generating predictions, with very little, if any, increase in slope. (An increase in slope might indicate learning - that infants wait for more information before generating predictions.)
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize data
FIPdatasummary <- summaryBy(AEM.T1 ~ MCDI.Group + Sub.Num + Trial.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block == "Block 1"))
FIPdatasummary$AnyAEM <- FIPdatasummary$AEM.T1.mean
FIPdatasummary$AEM.T1.mean <- NULL
FIPdatasummary <- FIPdatasummary[!is.na(FIPdatasummary$MCDI.Group), ]

# calculate mean and standard error for plots
AnyAEM.mean <- summaryBy(AnyAEM ~ MCDI.Group + Trial.Num, FUN=c(mean), data=FIPdatasummary)
AnyAEM.var <- summaryBy(AnyAEM ~ MCDI.Group + Trial.Num, FUN=c(var), data=FIPdatasummary)
AnyAEM.length <- summaryBy(AnyAEM ~ MCDI.Group + Trial.Num, FUN=c(length), data=FIPdatasummary)
AnyAEM.se <- sqrt((AnyAEM.var$AnyAEM.var)/(AnyAEM.length$AnyAEM.length))
AnyAEM.se <- data.frame(AnyAEM.se)

plot.data <- cbind(AnyAEM.mean, AnyAEM.se)
plot.data$upper <- plot.data$AnyAEM.mean + plot.data$AnyAEM.se
plot.data$lower <- plot.data$AnyAEM.mean - plot.data$AnyAEM.se
colnames(plot.data)[1:3] <- c("MCDI.Group","Trial.Number","Proportion.Correct.AEMs")

# plot
ggplot(data=plot.data, aes(x=Trial.Number, y=Proportion.Correct.AEMs, color=MCDI.Group)) + 
  scale_color_grey() +
  geom_point(stat="identity", size=5) +
  geom_errorbar(aes(ymin=lower, ymax=upper), width=.1) +
  geom_smooth(aes(x=Trial.Number, y=Proportion.Correct.AEMs, color=MCDI.Group), method="lm", se=FALSE, fullrange=TRUE) +
  scale_x_continuous(limits=c(1.9,8.1), breaks=seq(2,8,1), name="Trial Number") +
  scale_y_continuous(limits=c(-.01,1.01), breaks=seq(0,1,.1), name="Proportion of Infants\nMaking an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black')) +
  theme(legend.title = element_text()) +
  theme(legend.position = "bottom")
```
\newpage

## AEMs in Block 2

MCDI does not correlate with AEMs (to either target location - novel or familiar) in Block 2. (This analysis excludes Trial 9, as there is no basis to generate a prediction towards the novel target location yet.)
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize data with proportion of AEMs in Block 2
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Any ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2" & FIPdata$Trial.Num!=9))

# clean up headers and plot
colnames(corr.data)[1:4] <- c("Subject","MCDI","Age.Months","Any.AEM")
ggplot(data=corr.data, aes(x=MCDI, y=Any.AEM)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-.1,1.1), breaks=seq(0, 1, .1), name="Proportion of Trials\nwith an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$Any.AEM) #n.s.

# summarize data by count of AEMs in Block 2
MCDI.data <- summaryBy(Words.Says.Scaled + Age ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2" & FIPdata$Trial.Num!=9))
AEM.data <- summaryBy(AEM.Any ~ Sub.Num + Tr.Num, FUN=c(sum), data=subset(FIPdata, FIPdata$Block=="Block 2" & FIPdata$Trial.Num!=9))
AEM.data$AEM <- ifelse(AEM.data$AEM.Any.sum >0,1,0)
AEM.data <- summaryBy(AEM ~ Sub.Num, FUN=c(sum), data=AEM.data)
corr.data <- merge(MCDI.data, AEM.data)

# clean up headers and plot
colnames(corr.data)[1:4] <- c("Subject","MCDI","Age.Months","AEM.count")
ggplot(data=corr.data, aes(x=MCDI, y=AEM.count)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-1,8), breaks=seq(0,7,1), name="Number of Trials\nwith an AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$AEM.count) #n.s.
```
\newpage

## Updating AEMs in Block 2

To test our second hypothesis – that infants with larger vocabularies would update predictions more readily – we classified AEMs in Block 2 as directed toward the novel location (i.e., the Block 2 target location) or the familiar location (i.e., the Block 1 target location). Trial 9 was excluded from this analysis, because this was the first trial in which the target appeared in the novel location. Thus, infants had no basis for directing an AEM to the novel target location during Trial 9. Infants who did not make any AEMs (n=8) or who did not have any code-able trials in Block 2 (n=1) were not included in this analysis. For each infant, we calculated the proportion of their AEMs that were to the novel location (i.e., novel AEMs divided by total AEMs). We found a significant correlation between infants’ vocabulary size and the proportion of their AEMs to the novel location [r(39)=0.34, p=0.028]. Consistent with our second hypothesis, when making an AEM in Block 2, infants with larger vocabularies were more likely to make an AEM to the novel target location than infants with smaller vocabularies.
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=3, fig.align='center'}
# summarize data - Novel/(Novel+Familiar)
corr.data <- summaryBy(Words.Says.Scaled + AEM.Tn ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2" & AEM.Any==1 & FIPdata$Trial.Num!=9))

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","AEM.ratio")
ggplot(data=corr.data, aes(x=MCDI, y=AEM.ratio)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter", size=5) +
  scale_x_continuous(limits=c(-.01,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-.01,1.01), breaks=seq(0, 1, .1), name="Proportion of AEMs\nto Novel Location") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black')) +
  theme(legend.position = c(.8,.2))

cor.test(corr.data$MCDI, corr.data$AEM.ratio)
```
\newpage

## Updating AEMs by Trial in Block 2

To examine the time-course of prediction updating, we analyzed AEMs trial by trial in Block 2. We divided infants into high-vocabulary and low-vocabulary groups based on a median split in MCDI percentile scores (high vocabulary: n=22, M=64, SD=19; low vocabulary: n=19, M=17, SD=9). Then, on each trial we calculated the proportion of AEMs to the novel location as the number of infants who made an AEM to the novel location divided by the number of infants who made AEMS to either location. A binomial test was used to test this proportion against a chance value of 0.5. The low-vocabulary group only performed above chance on trial 16 (p=0.039), making significantly more AEMs to the novel than the familiar location. The high-vocabulary group was marginally above chance in trial 13 (p=0.092), above chance in trial 14 (p=0.006), and above chance in trial 16 (p=0.039).
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# # summarize data for descriptive statistics
# data.summary <- summaryBy(Words.Says.Scaled ~ MCDI.Group + Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2" & FIPdata$AEM.Any==1))
# 
# # report descriptive statistics
# high.mcdi <- subset(data.summary, data.summary$MCDI.Group=="High Vocabulary")
# high.mcdi.n <- length(high.mcdi$Sub.Num)
# high.mcdi.n
# high.mcdi.mean <- mean(high.mcdi$Words.Says.Scaled)
# high.mcdi.mean
# high.mcdi.sd <- sd(high.mcdi$Words.Says.Scaled)
# high.mcdi.sd
# 
# low.mcdi <- subset(data.summary, data.summary$MCDI.Group=="Low Vocabulary")
# low.mcdi.mean <- mean(low.mcdi$Words.Says.Scaled)
# low.mcdi.n <- length(low.mcdi$Sub.Num)
# low.mcdi.n
# low.mcdi.mean
# low.mcdi.sd <- sd(low.mcdi$Words.Says.Scaled)
# low.mcdi.sd

# summarize data
data.summary <- summaryBy(AEM.Tn ~ MCDI.Group + Sub.Num + Trial.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2" & FIPdata$Trial.Num !=9 & FIPdata$AEM.Any==1))
data.summary$ratio <- data.summary$AEM.Tn.mean

# calculate mean and standard error for plot
ratio.mean <- summaryBy(ratio ~ MCDI.Group + Trial.Num, FUN=c(mean), data=data.summary)
ratio.var <- summaryBy(ratio ~ MCDI.Group + Trial.Num, FUN=c(var), data=data.summary)
ratio.length <- summaryBy(ratio ~ MCDI.Group + Trial.Num, FUN=c(length), data=data.summary)
ratio.se <- sqrt((ratio.var$ratio.var)/(ratio.length$ratio.length))
ratio.se <- data.frame(ratio.se)

plot.data <- cbind(ratio.mean, ratio.se)
plot.data$upper <- plot.data$ratio.mean + plot.data$ratio.se
plot.data$lower <- plot.data$ratio.mean - plot.data$ratio.se
colnames(plot.data)[1:3] <- c("MCDI.Group","Trial.Number","AEM.ratio")
plot.data$MCDI.Group.Factor <- factor(plot.data$MCDI.Group, levels=c('Low Vocabulary','High Vocabulary'))

# plot
ggplot(data=plot.data, aes(x=Trial.Number, y=AEM.ratio)) + 
  facet_grid(. ~ MCDI.Group.Factor) +
  geom_point(stat="identity", color="black", size=5) +
  geom_errorbar(aes(ymin=lower, ymax=upper), width=.1) +
  geom_smooth(aes(x=Trial.Number, y=AEM.ratio, color=MCDI.Group), method="lm", color="black", se=FALSE, fullrange=TRUE) +
  scale_x_continuous(limits=c(9.5,16.5), breaks=seq(10,16,1), name="Trial Number") +
  scale_y_continuous(limits=c(-.01,1.01), breaks=seq(0, 1, .1), name="Proportion of AEMs\nto Novel Location") +
  geom_hline(aes(yintercept=.5), linetype='dashed', alpha=1) + # chance
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black')) +
  theme(legend.position = "none")

# # run binom.test for each trial
# # binom.test(x=numbercorrect, n=totaltrials, p=.5, conf.level=.95)
# binom.test(x=3, n=14, p=.5, conf.level=.95) # Low-MCDI kids are marginally below chance at trial 10 #p-value = 0.05737
# binom.test(x=5, n=10, p=.5, conf.level=.95) # Low-MCDI kids are at chance at trial 11 #p-value = 1
# binom.test(x=5, n=12, p=.5, conf.level=.95) # Low-MCDI kids are at chance at trial 12 #p-value = 0.7744
# binom.test(x=5, n=10, p=.5, conf.level=.95) # Low-MCDI kids are at chance at trial 13 #p-value = 1
# binom.test(x=6, n=9, p=.5, conf.level=.95) # Low-MCDI kids are at chance at trial 14 #p-value = 0.5078
# binom.test(x=6, n=9, p=.5, conf.level=.95) # Low-MCDI kids are at chance at trial 15 #p-value = 0.5078
# binom.test(x=8, n=9, p=.5, conf.level=.95) # Low-MCDI kids are above chance at trial 16 #p-value = 0.03906
# 
# binom.test(x=8, n=15, p=.5, conf.level=.95) # High-MCDI kids are at chance at trial 10 #p-value = 1
# binom.test(x=7, n=9, p=.5, conf.level=.95) # High-MCDI kids are at chance at trial 11 #p-value = 0.1797
# binom.test(x=4, n=6, p=.5, conf.level=.95) # High-MCDI kids are at chance at trial 12 #p-value = 0.6875
# binom.test(x=10, n=13, p=.5, conf.level=.95) # High-MCDI kids are marginally above chance at trial 13 #p-value = 0.09229
# binom.test(x=11, n=12, p=.5, conf.level=.95) # High-MCDI kids are above chance at trial 14 #p-value = 0.006348
# binom.test(x=6, n=7, p=.5, conf.level=.95) # High-MCDI kids are at chance at trial 15 #p-value = 0.125
# binom.test(x=8, n=9, p=.5, conf.level=.95) # High-MCDI kids are above chance at trial 16 #p-value = 0.03906
```
\newpage

## AEMs and Age
To assess whether nonverbal prediction varied as a function of age, we repeated the above correlations with infants’ age in months as a factor. We found no correlation between infants’ age and overall proportion of trials with an AEM [r(48)=-0.25, p=0.08]. There was no correlation in Block 1 [r(48)=-0.23, p=0.11] nor in Block 2 [r(47)=-0.22, p=0.13]. Finally, there was no correlation between infants’ age and the proportion of their AEMs to the novel location in Block 2 [r(39)=0.10, p=0.52].
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize data by proportion of AEMs in both blocks
corr.data <- summaryBy(Age + AEM.Any ~ Sub.Num, FUN=c(mean), data=FIPdata)
colnames(corr.data)[1:3] <- c("Subject","Age","Any.AEM")
cor.test(corr.data$Age, corr.data$Any.AEM) #n.s.

# summarize data with proportion of AEMs in Block 1
corr.data <- summaryBy(Age + AEM.Any ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 1"))
colnames(corr.data)[1:3] <- c("Subject","Age","Any.AEM")
cor.test(corr.data$Age, corr.data$Any.AEM) #n.s.

# summarize data with proportion of AEMs in Block 2
corr.data <- summaryBy(Age + AEM.Any ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2"))
colnames(corr.data)[1:3] <- c("Subject","Age","Any.AEM")
cor.test(corr.data$Age, corr.data$Any.AEM) #n.s.

# summarize data with proportion of AEMs to novel location in Block 2
corr.data <- summaryBy(Age + AEM.Tn ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 2" & AEM.Any==1 & FIPdata$Trial.Num!=9))
colnames(corr.data)[1:3] <- c("Subject","Age","AEM.ratio")
cor.test(corr.data$Age, corr.data$AEM.ratio)
```
\newpage

# Visual Processing (Gap-Overlap) Task

The gap-overlap task allowed us to examine whether individual differences in AEMs were due to differences in visual processing speed. First, we confirmed that infants were slower to shift their eyes to the target on overlap trials when there was a competing stimulus than on gap trials when there was no competing stimulus [gap trials: M=408 ms, SD=31; overlap trials: M=509 ms, SD=74; two-tailed t(42)=-10.4, p<.001].

## Reaction Time (RT) by Condition
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=3, fig.align='center'}
# read in data
GAPdata <- read.csv("visual.processing.task.data.csv", header = TRUE)

# get RTs by subject, trial, condition
RT1 <- ddply((subset(GAPdata,TargetLook==1)),.(Sub.Num,Tr.Num,Condition),summarise,minRT = min(TimeBin100ms))
# summarize RTs by subject and condition, omitting any RTs < 200
RT2 <- ddply(subset(RT1,minRT >= 200),.(Sub.Num,Condition),summarise,meanRT = mean(minRT))
# get AVERAGE RTs by subject
Average.RTs <- ddply(subset(RT2),.(Sub.Num),summarise,Average.RT = mean(meanRT))
# get GAP RTs by subject
Gap.RTs <- ddply(subset(RT2,Condition=="gap"),.(Sub.Num),summarise,Gap.RT = mean(meanRT))
# get OVERLAP RTs by subject
Overlap.RTs <- ddply(subset(RT2,Condition=="overlap"),.(Sub.Num),summarise,Overlap.RT = mean(meanRT))
# combine these
RT.data <- merge(Average.RTs, Gap.RTs)
RT.data <- merge(RT.data, Overlap.RTs)
# get RT difference scores by subject
RT.data$RT.Score <- (RT.data$Overlap.RT - RT.data$Gap.RT)

# merge prediction task data and visual processing task data
FIPandGAPdata <- merge(FIPdata,RT.data)

# calculate mean and standard error for plots
RT.mean <- summaryBy(meanRT ~ Condition, FUN=c(mean), data=RT2)
RT.sd <- summaryBy(meanRT ~ Condition, FUN=c(sd), data=RT2)
RT.var <- summaryBy(meanRT ~ Condition, FUN=c(var), data=RT2)
RT.length <- summaryBy(meanRT ~ Condition, FUN=c(length), data=RT2) #n
RT.se <- sqrt((RT.var$meanRT.var)/(RT.length$meanRT.length))
RT.se <- data.frame(RT.se)

plot.data <- cbind(RT.mean, RT.se)
plot.data$upper <- plot.data$meanRT.mean + plot.data$RT.se
plot.data$lower <- plot.data$meanRT.mean - plot.data$RT.se
colnames(plot.data)[1:3] <- c("Condition","Mean.Reaction.Time","RT.se")

# plot
ggplot(data=plot.data, aes(x=Condition, y=Mean.Reaction.Time, fill=Condition)) +
  geom_bar(stat="identity", color="black") +
  scale_fill_brewer(palette="Paired") +
  geom_errorbar(aes(ymin=lower, ymax=upper), color="black", width=.25) +
  scale_x_discrete() +
  scale_y_continuous(limits=c(0,600), breaks=seq(0, 600, 200), name="Mean Reaction Time (ms)") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black')) +
  theme(legend.text = element_text()) +
  theme(legend.position = "none")

# calculate p-value
t.test(x = RT.data$Gap.RT, y = RT.data$Overlap.RT, paired = TRUE)
```
\newpage

## Correlating Visual Processing Speed and AEMs

Next, we computed RT difference scores for each infant (i.e., overlap trials minus gap trials) and used this difference score as the measure of visual processing speed independent of prediction. We found no correlation between infants’ RT difference scores and their proportion of AEMs overall [r(40)=-0.09, p=0.56] and no correlation between infants’ RT difference scores and their proportion of AEMs to the novel target location in Block 2 of the prediction task [r(33)=0.18, p=0.29]. Together, these findings suggest that infants’ performance on the prediction task was not related to variation in orienting to the dynamics of visual events.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# proportion any AEM (novel or familiar)
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Any + RT.Score ~ Sub.Num, FUN=c(mean), data=FIPandGAPdata)
cor.test(corr.data$RT.Score.mean, corr.data$AEM.Any.mean)

# ratio in block 2 (novel/(novel+familiar))
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Tn + RT.Score ~ Sub.Num, FUN=c(mean), data=subset(FIPandGAPdata, FIPandGAPdata$AEM.Any==1 & FIPandGAPdata$Block=="Block 2" & FIPandGAPdata$Trial.Num!=9))
cor.test(corr.data$RT.Score.mean, corr.data$AEM.Tn.mean)
```
\newpage

# Additional Analyses (Response to Reviewers)

**We completed a number of additional analyses, which were included in our response to reviewers.**

*Why start the AEM measurement window 200ms BEFORE center fixation image offset? In principle, it seems like you could start the window AT fixation image offset or 200ms AFTER fixation image offset, but that any other selection is going to be arbitrary.*

Behaviorally, infants may not generate anticipatory eye movements (AEMs) with precise timing. Therefore, we suspected that some AEMs might occur prior to the offset of the central fixation. To ensure that we included such eye movements in analyses, we used a 200-ms buffer at both the onset and offset of our AEM window. But your comment prompted us to dig deeper. In the below figure, we plotted infants' looks to the periphery over time. However, very few looks to the periphery (1.2%) occurred in the 200-ms window prior to fixation offset (i.e., between 1300 and 1500 ms). Thus, the decision to include this 200-ms buffer is unlikely to have impacted AEM results. (Also note that the high number of looks to the periphery at the onset of the trial is due to residual looking behavior from the prior trial.) In light of this, we decided not to alter our original plan for window size.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize looks to target or distractor within each 100-ms timebin
FIPdata$TimeBin <- round((FIPdata$Time+49)/100)*100
FIPdata$peripheral.look <- ifelse(FIPdata$value==.5,0,1)
FIPdata$target.look <- ifelse(FIPdata$value==1,1,0)
FIPdata$distractor.look <- ifelse(FIPdata$value==0,1,0)

sum1 <- summaryBy(peripheral.look ~ TimeBin + Sub.Num + Tr.Num, FUN=c(sum), data=FIPdata)
plot.data <- summaryBy(peripheral.look.sum ~ TimeBin, FUN=c(sum), data=sum1)

ggplot(plot.data, aes(x=TimeBin, y=peripheral.look.sum.sum)) +
  geom_bar(stat="identity") +
  scale_x_continuous(limits=c(-100,3100), breaks=seq(0,3000,500), name="Time from Trial Onset") +
  scale_y_continuous(limits=c(0,2500), breaks=seq(0,2500,500), name="Number of Looks to Periphery") +
  theme_bw(base_family = "Times", base_size=14) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black')) +
  theme(legend.position = 'none') +
  theme(legend.title = element_blank())

# what % of AEMs would we exclude by removing the 200-ms buffer at fixation offset?
sum2 <- subset(sum1, sum1$TimeBin>=1300 & sum1$TimeBin<=2500)
total.AEMs <- sum(sum2$peripheral.look.sum)
sum3 <- subset(sum2, sum2$TimeBin>=1300 & sum2$TimeBin<1500)
early.AEMs <- sum(sum3$peripheral.look.sum)
```
\newpage

*Why no auditory stimuli or moving visual stimuli in the visual processing task? Did this make the task less engaging?*

We decided to use silent, static images, in order to make our design highly comparable to prior studies. Specifically, we felt it appropriate to hold the parameters from Hood and Atkinson (1993) constant. However, there is no reason to believe that this specific design decision was crucial to evaluating visual processing speed. To assess whether this design made the task less engaging, we calculated the proportion of trials included in analysis per subject in each task, and compared the proportions with a Welch t-test. We find that the proportion of trials included in analyses did not vary by task (t(83) = 0.50, p = 0.62), suggesting that the two tasks were equally engaging for infants.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# infants were not less engaged in the visual processing task as compared to the nonverbal prediction task:
sum1 <- summaryBy(AEM.Any ~ Sub.Num + Age + Tr.Num, FUN=c(length), data=FIPdata)
task1 <- summaryBy(Tr.Num ~ Sub.Num + Age, FUN=c(length), data=sum1)
task1$prop <- task1$Tr.Num.length/16
sum3 <- summaryBy(Time ~ Sub.Num + Tr.Num, FUN=c(length), data=GAPdata)
task2 <- summaryBy(Tr.Num ~ Sub.Num, FUN=c(length), data=sum3)
task2$prop <- task2$Tr.Num.length/24
task1$task <- c("task1")
task2$task <- c("task2")
t.test(task1$prop, task2$prop)
```
\newpage

*I wonder if having a "standardized difference score" for this task (parallel to the standardized MCDI score) might be a better predictor of prediction probability/prediction updating.*

Thank you for this analysis suggestion. To assess this possibility, we transformed each of the RT difference scores into a z-score (i.e., subtracted the mean from every value and divided the difference by the standard deviation), and completed correlations among RTs and AEMs. We find the same pattern of results as with the raw RT difference scores: We find no correlation between infants' RT difference scores and their proportion of AEMs overall [r(40)=-0.09, p=0.56] and no correlation between infants' RT difference scores and their proportion of AEMs to the novel target location in Block 2 of the prediction task [r(33)=0.18, p=0.29]. We are not including these statistics in the manuscript, but would be willing to if you think it is important.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# z-score RT difference scores
RT.Score.sd <- sd(FIPandGAPdata$RT.Score)*sqrt((length(FIPandGAPdata$RT.Score)-1)/(length(FIPandGAPdata$RT.Score)))
RT.Score.mean <- mean(FIPandGAPdata$RT.Score)
FIPandGAPdata$RT.Score.z <- (FIPandGAPdata$RT.Score - RT.Score.mean)/(RT.Score.sd)

# re-run RT-AEM correlations with z-scored RT difference scores
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Any + RT.Score.z ~ Sub.Num, FUN=c(mean), data=FIPandGAPdata)
cor.test(corr.data$RT.Score.z.mean, corr.data$AEM.Any.mean)

corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Tn + RT.Score.z ~ Sub.Num, FUN=c(mean), data=subset(FIPandGAPdata, FIPandGAPdata$AEM.Any==1 & FIPandGAPdata$Block=="Block 2" & FIPandGAPdata$Trial.Num!=9))
cor.test(corr.data$RT.Score.z.mean, corr.data$AEM.Tn.mean)
```
\newpage

*Why the difference scores? Would RT on the gap trials or on the overlap trials alone have been a better predictor?*

For the sake of brevity in the manuscript, we used an RT difference score (overlap RT minus gap RT) to evaluate whether infants' performance on the visual processing task might explain their performance on the nonverbal prediction task. However - and again not included in the manuscript for the sake of brevity - we also analyzed RTs for gap trials and overlap trials independently. We found that infants' RTs do not correlate with their proportion of AEMs overall [Gap trials: r(40) = -0.21, p = 0.18; Overlap trials: r(40) = -0.17, p = 0.29], nor do infants' RTs correlate with their proportion of AEMs correct in the second block of the nonverbal prediction task [Gap trials: r(33) = -0.10, p = 0.56; Overlap trials: r(33) = 0.13, p = 0.47].

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# proportion any AEM (novel or familiar)
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Any + Gap.RT ~ Sub.Num, FUN=c(mean), data=FIPandGAPdata)
cor.test(corr.data$Gap.RT.mean, corr.data$AEM.Any.mean)

# proportion any AEM (novel or familiar)
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Any + Overlap.RT ~ Sub.Num, FUN=c(mean), data=FIPandGAPdata)
cor.test(corr.data$Overlap.RT.mean, corr.data$AEM.Any.mean)

# ratio in block 2 (novel/(novel+familiar))
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Tn + Gap.RT ~ Sub.Num, FUN=c(mean), data=subset(FIPandGAPdata, FIPandGAPdata$AEM.Any==1 & FIPandGAPdata$Block=="Block 2" & FIPandGAPdata$Trial.Num!=9))
cor.test(corr.data$Gap.RT.mean, corr.data$AEM.Tn.mean)

# ratio in block 2 (novel/(novel+familiar))
corr.data <- summaryBy(Words.Says.Scaled + Age + AEM.Tn + Overlap.RT ~ Sub.Num, FUN=c(mean), data=subset(FIPandGAPdata, FIPandGAPdata$AEM.Any==1 & FIPandGAPdata$Block=="Block 2" & FIPandGAPdata$Trial.Num!=9))
cor.test(corr.data$Overlap.RT.mean, corr.data$AEM.Tn.mean)
```
\newpage

*Any hint that the kids who were excluded for fussiness came from a particular portion of your sample? Low-vocab, young? I always wonder once there are more than a couple who fuss out, especially with a relatively large age range like this.*

We excluded 8 infants (12.5%) from all analyses due to fussiness (i.e., fewer than 50% of their trials were code-able). In response to your comment, we compared age and vocabulary measures for excluded and included infants, and found no differences in age [t(8.58) = 0.06, p = 0.95], MCDI comprehensive vocabulary size [t(11.26) = -0.63, p = 0.54], or MCDI productive vocabulary size [t(11.71) = -0.40, p = 0.695]. We included this information as a footnote on p. 5 of the manuscript.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
subjectlog <- read.csv("subjectlog.csv", header=T)
excluded.kids <- subset(subjectlog, subjectlog$Include=="NO")
included.kids <- subset(subjectlog, subjectlog$Include=="YES")
excluded.kids <- subset(excluded.kids, excluded.kids$Exlude.Reason=="INATTENTIVE")
t.test(excluded.kids$Age, included.kids$Age)
t.test(excluded.kids$MCDI.Total.Words.Understands, included.kids$MCDI.Total.Words.Understands)
t.test(excluded.kids$MCDI.Total.Words.Says, included.kids$MCDI.Total.Words.Says)
```
\newpage

*The combination of analysis using correlation, the use of standardized vocabulary scores, and the large age range make me wonder about the distribution of standard scores across ages. Do you have a good distribution of standard vocab scores in, say, each quartile of your age range?*

To address this question, we plotted the distribution of MCDI scores in each quartile of our age range. While the 18-20-month-olds show slightly lower standard vocabulary scores, as you can see below, the overall range is comparable across quartiles.
```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# distribution of vocab across age
vocab.summary <- summaryBy(Words.Says.Scaled ~ Sub.Num + Age, FUN=c(mean), data=FIPdata)
vocab.summary$Age.Group <- ifelse(vocab.summary$Age>=12 & vocab.summary$Age<=14, "12-14",
                                  ifelse(vocab.summary$Age>=15 & vocab.summary$Age<=17, "15-17",
                                        ifelse(vocab.summary$Age>=18 & vocab.summary$Age<=20, "18-20",
                                               ifelse(vocab.summary$Age>=21 & vocab.summary$Age<=24, "21-24",NA))))
vocab.summary$Age.Group <- as.factor(vocab.summary$Age.Group)

# boxplot
ggplot(vocab.summary, aes(x=Age.Group, y=Words.Says.Scaled.mean)) +
  geom_boxplot() +
  scale_x_discrete(name="Age in Months") +
  scale_y_continuous(limits=c(0,100), breaks=seq(0,100,25), name="Productive Vocabulary Percentile (MCDI)") +
  theme_bw(base_family = "Times", base_size = 14) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank())

# density plot
ggplot(vocab.summary, aes(x=Words.Says.Scaled.mean, fill=Age.Group)) +
  geom_density(alpha=.3) +
  scale_x_continuous(limits=c(0,100), breaks=seq(0,100,25), name="Productive Vocabulary Percentile (MCDI)") +
  theme_bw(base_family = "Times", base_size = 14) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank())
```
\newpage

*The negative r's for age and likelihood of an AEM (though non-significant) are interesting. Were older kids less engaged by the task?*

To answer this question, we correlated each infant's proportion of trials included in analyses with their age in months. We found no correlation between age and number of trials included [r(48) = -0.06, p = 0.65], suggesting that older infants were not less engaged in the task. We appreciate that you generated many careful ideas for further data analysis.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
sum1 <- summaryBy(AEM.Any ~ Sub.Num + Age + Tr.Num, FUN=c(length), data=FIPdata)
corr.data <- summaryBy(Tr.Num ~ Sub.Num + Age, FUN=c(length), data=sum1)
corr.data$prop <- corr.data$Tr.Num.length/16
cor.test(corr.data$Age, corr.data$prop)
```
\newpage

*I am not clear why the authors did not measure children’s proportion of AEMs to the correct location for trials 2-8. It is clear why looking at proportion of AEMs to the novel location on trials 10-16 provides us with a measure of children’s updated anticipations, but it seems it would be helpful to have a baseline for low vs. high vocabulary children on how well they learn to make the primary anticipations that one would assume they are making in the first block of trials.*

For the sake of brevity in the manuscript, we reported minimal results from Block 1. Specifically, on p. 9, we reported the null correlation between infants' proportion of trials with an AEM in Block 1 and vocabulary size [r(48)=-0.03, p=0.82]. However, addressing your question, we analyzed the proportion of infants' AEMs to the correct location in Block 1 (trials 2 through 8, excluding infants who did not make any AEMs) and again found a null correlation between infants' Block 1 AEMs and vocabulary size [r(42)=0.24, p=0.12]. We also find that the vast majority of infants' AEMs in Block 1 (189 of 193 total AEMs, 98%) were to the correct peripheral target location, indicating a ceiling effect. Lastly, we also analyzed AEMs trial-by-trial in Block 1, and found no differences in performance among high-vocabulary and low-vocabulary infants. These analyses are all included in supplementary materials, and we also added a sentence on p. 9-10 of the manuscript to address this comment.

```{r, echo=FALSE, message=FALSE, warning=FALSE, fig.width=6, fig.height=4, fig.align='center'}
# summarize data with proportion of AEMs correct in Block 1
corr.data <- summaryBy(Words.Says.Scaled + AEM.Tn ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 1" & AEM.Any==1))

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","correct.AEM")
ggplot(data=corr.data, aes(x=MCDI, y=correct.AEM)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-.1,1.1), breaks=seq(0, 1, .1), name="Proportion of Trials\nwith a Correct AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$correct.AEM) #n.s.

# summarize data by count of AEMs in Block 1
MCDI.data <- summaryBy(Words.Says.Scaled ~ Sub.Num, FUN=c(mean), data=subset(FIPdata, FIPdata$Block=="Block 1" & AEM.Any==1))
AEM.data <- summaryBy(AEM.Tn ~ Sub.Num + Tr.Num, FUN=c(sum), data=subset(FIPdata, FIPdata$Block=="Block 1" & AEM.Any==1))
AEM.data$AEM <- ifelse(AEM.data$AEM.Tn.sum >0,1,0)
AEM.data <- summaryBy(AEM ~ Sub.Num, FUN=c(sum), data=AEM.data)
corr.data <- merge(MCDI.data, AEM.data)

# clean up headers and plot
colnames(corr.data)[1:3] <- c("Subject","MCDI","AEM.count")
ggplot(data=corr.data, aes(x=MCDI, y=AEM.count)) +
  geom_smooth(method="lm", se=FALSE, color="black") +
  geom_point(position="jitter",size=5) +
  scale_x_continuous(limits=c(-1,101), breaks=seq(0, 100, 10), name="Productive Vocabulary Percentile (MCDI)") +
  scale_y_continuous(limits=c(-1,8), breaks=seq(0,7,1), name="Number of Trials\nwith a Correct AEM") +
  theme_bw(base_family = "Times", base_size = 16) +
  theme(panel.grid.minor = element_blank()) +
  theme(panel.grid.major = element_blank()) +
  theme(panel.border = element_rect(size=1, color='black'))

cor.test(corr.data$MCDI, corr.data$AEM.count) #n.s.

# how many incorrect AEMs did infants make in Block 1 and what proportion of AEMs is this?
AEM.data <- summaryBy(AEM.Any + AEM.Tn ~ Sub.Num + Tr.Num, FUN=c(sum), data=subset(FIPdata, FIPdata$Block=="Block 1" & FIPdata$Tr.Num>=2))
AEM.data$AEM.Any.sum <- ifelse(AEM.data$AEM.Any.sum>0,1,0)
AEM.data$AEM.Tn.sum <- ifelse(AEM.data$AEM.Tn.sum>0,1,0)

total.AEMs <- sum(AEM.data$AEM.Any.sum) #193
correct.AEMs <- sum(AEM.data$AEM.Tn.sum) #189
# 4 AEMs out of 193 (2%) were to the incorrect peripheral location
```